10  Import/export data

Author

Vladimir Buskin

10.2 Preparation

An R-script usually begins by loading the libraries needed to run the code that follows. In this unit, we will need readxl and writexl to import and export MS Excel files.

library(readxl)
library(writexl)

Simply copy the code lines above into your script and execute them.
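If one of the packages has not been installed yet, library() will stop with an error. The following is a minimal sketch for checking this beforehand; the object names needed, is_installed and missing_pkgs are purely illustrative, and it is assumed that any missing package would then be installed manually with install.packages().

```r
# Packages this unit relies on
needed <- c("readxl", "writexl")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  # Only report what is missing; run install.packages() yourself if needed
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```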

10.2.1 Exporting data

Assume we’d like to save our data frame with word frequencies to a local folder on our system. Let’s briefly regenerate it:

lemma <- c("start", "enjoy", "begin", "help")

frequency <- c(418, 139, 337, 281)

data <- data.frame(lemma, frequency)

print(data)
  lemma frequency
1 start       418
2 enjoy       139
3 begin       337
4  help       281

There are two common formats in which tabular data can be stored on your computer:

  • in .csv-files (‘comma-separated values’; a plain-text format that most spreadsheet programs, including LibreOffice, can open)

  • in .xls/.xlsx-files (Microsoft Excel files)

To save our data frame data in .csv-format, we can use the write.csv() function:

write.csv(data, "frequency_data.csv")

The file is now stored in your current working directory, which is often, but not always, the folder containing your R-script. You can open this file …

  • in LibreOffice

  • in Microsoft Excel via File > Import > CSV file > Select the file > Delimited and then Next > Comma and Next > General and Finish.

Clearly, opening CSV files in MS Excel is quite cumbersome, which is why it’s better to export the data as an Excel file directly.

We use write_xlsx() provided by the writexl package:

write_xlsx(data, "frequency_data.xlsx")

The file is now stored in your current working directory, alongside the CSV file. You should be able to open it in MS Excel without any issues.
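Relative file names like the ones used above are resolved against R's working directory, which can be inspected with getwd(). The sketch below shows how to check that location and how to write to an explicitly chosen folder instead. The target folder (a temporary directory) and the object names out_dir and out_file are placeholders for illustration only.

```r
# Show the folder that relative file names are resolved against
print(getwd())

# Small example data frame (same structure as above)
data <- data.frame(
  lemma = c("start", "enjoy", "begin", "help"),
  frequency = c(418, 139, 337, 281)
)

# Placeholder target folder; replace tempdir() with a folder of your choice
out_dir <- tempdir()
out_file <- file.path(out_dir, "frequency_data.csv")

write.csv(data, out_file)

# Confirm that the file was actually created
print(file.exists(out_file))
```

Building the path with file.path() avoids having to type the path separator by hand, which differs between operating systems.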

10.2.2 Importing data

Let’s read the two files back into R.

To import the CSV file, we can use the read.csv() function:

imported_csv <- read.csv("frequency_data.csv")
print(imported_csv)
  X lemma frequency
1 1 start       418
2 2 enjoy       139
3 3 begin       337
4 4  help       281

The additional column X contains the row names, which write.csv() saves by default. To get rid of it, we’ll use some subsetting:

imported_csv <- imported_csv[,-1] # delete first column
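Alternatively, the row names can be left out at export time via the row.names = FALSE argument of write.csv(), so that no extra column needs to be deleted after import. A minimal sketch, which writes to a temporary file (tmp_file and reimported are illustrative names) so that the file created earlier stays untouched:

```r
data <- data.frame(
  lemma = c("start", "enjoy", "begin", "help"),
  frequency = c(418, 139, 337, 281)
)

# Temporary file, used here only for illustration
tmp_file <- tempfile(fileext = ".csv")

# row.names = FALSE suppresses the column of row numbers
write.csv(data, tmp_file, row.names = FALSE)

# The re-imported data frame now has only the two original columns
reimported <- read.csv(tmp_file)
print(names(reimported))
```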

For importing the Excel file, we’ll use the read_xlsx() function from the readxl package:

imported_excel <- read_xlsx("frequency_data.xlsx")
print(imported_excel)
# A tibble: 4 × 2
  lemma frequency
  <chr>     <dbl>
1 start       418
2 enjoy       139
3 begin       337
4 help        281

That’s it! Nevertheless, remember to always check your imported data to ensure it has been read correctly, especially when working with CSV files.
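A few base R functions are handy for such checks. The sketch below sends the example data through a temporary CSV file and back (tmp_file and reimported are illustrative names) and then inspects the result; which checks are appropriate will depend on your own data.

```r
data <- data.frame(
  lemma = c("start", "enjoy", "begin", "help"),
  frequency = c(418, 139, 337, 281)
)

# Write to a temporary file and read it back in
tmp_file <- tempfile(fileext = ".csv")
write.csv(data, tmp_file, row.names = FALSE)
reimported <- read.csv(tmp_file)

str(reimported)           # column names and types
print(head(reimported))   # first rows
print(dim(reimported))    # number of rows and columns
print(anyNA(reimported))  # TRUE if any value is missing

# Compare with the original, allowing for small numeric differences
print(isTRUE(all.equal(data, reimported)))
```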

10.2.3 Exercises
